Transposition Mechanism for Sparse Matrices on Vector Processors
Abstract
Many scientific applications involve operations on sparse matrices. However, due to irregularities induced by the sparsity patterns, many operations on sparse matrices execute inefficiently on traditional scalar and vector architectures. To tackle this problem, a scheme has been proposed consisting of two parts: (a) an extension to a vector architecture to support sparse matrix-vector multiplication using (b) a novel Blocked Based sparse matrix Compression Storage (BBCS) format. Within this context, in this paper we propose and describe a hardware mechanism for the extended vector architecture that performs the transposition A^T of a sparse matrix A using a hierarchical variation of the aforementioned sparse matrix compression format. The proposed Sparse matrix Transposition Mechanism (STM) is used as a functional unit of a vector processor and requires an s × s word in-processor memory, where s is the vector processor's section size. In this paper we provide a full description of the STM and show an expected performance increase of one order of magnitude.
Keywords: vector processor, matrix transpose, sparse matrix, functional unit
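For orientation, the sketch below shows what a conventional software transposition of a sparse matrix can look like. It is not the STM and does not use the BBCS format or its hierarchical variation, whose layouts the abstract does not describe; it assumes the widely used compressed sparse row (CSR) layout instead, and the function name `csr_transpose` is hypothetical. The scatter step writes to data-dependent positions, which is the kind of irregular access the abstract refers to.

```python
def csr_transpose(n_rows, n_cols, row_ptr, col_idx, values):
    """Transpose a CSR matrix A (n_rows x n_cols) in software.

    Illustrative baseline only: assumes plain CSR, not BBCS or the STM.
    Returns (row_ptr_t, col_idx_t, values_t) describing A^T in CSR.
    """
    nnz = len(values)

    # Count the nonzeros in each column of A (each row of A^T).
    counts = [0] * n_cols
    for c in col_idx:
        counts[c] += 1

    # A prefix sum over the counts gives the row pointers of A^T.
    row_ptr_t = [0] * (n_cols + 1)
    for c in range(n_cols):
        row_ptr_t[c + 1] = row_ptr_t[c] + counts[c]

    # Scatter every entry of A into its slot in A^T.
    # The destination index depends on the data (irregular access).
    next_slot = row_ptr_t[:-1]  # slicing yields a fresh list
    col_idx_t = [0] * nnz
    values_t = [0] * nnz
    for r in range(n_rows):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            c = col_idx[k]
            dst = next_slot[c]
            col_idx_t[dst] = r
            values_t[dst] = values[k]
            next_slot[c] += 1

    return row_ptr_t, col_idx_t, values_t


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 2, 0, 1]
values = [1, 2, 3, 4, 5]
row_ptr_t, col_idx_t, values_t = csr_transpose(3, 3, row_ptr, col_idx, values)
```

Because rows of A are visited in increasing order, the column indices within each row of the result come out sorted, so transposing twice reproduces the original arrays.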
Related resources
Optimizing Sparse Matrix-Vector Product Computations Using Unroll and Jam
Large-scale scientific applications frequently compute sparse matrix vector products in their computational core. For this reason, techniques for computing sparse matrix vector products efficiently on modern architectures are important. This paper describes a strategy for improving the performance of sparse matrix vector product computations using a loop transformation known as unroll-and-jam. ...
Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors
The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...
A Hierarchical Sparse Matrix Storage Format for Vector Processors
We describe and evaluate a Hierarchical Sparse Matrix (HiSM) storage format designed to be a unified format for sparse matrix applications on vector processors. The advantages that the format offers are low storage requirements, a flexible structure for element manipulations and allowing for efficient operations. To take full advantage of the format we also propose a vector architecture extensi...
Scalable BLAS 2 and 3 Matrix Multiplication for Sparse Banded Matrices on Distributed Memory MIMD Machines
In this paper, we present two algorithms for sparse banded matrix-vector and sparse banded matrix-matrix product operations on distributed memory multiprocessor systems that support a mesh and ring interconnection topology. We also study the scalability of these two algorithms. We employ systolic type techniques to eliminate synchronization delay and minimize the communication overhead among pr...
Sparse Matrix Storage Format
Operations on Sparse Matrices are the key computational kernels in many scientific and engineering applications. They are characterized with poor substantiated performance. It is not uncommon for microprocessors to gain only 10-20% of their peak floating-point performance when doing sparse matrix computations even when special vector processors have been added as coprocessor facilities. In this...
Publication date: 2001